Multivariate distribution of returns in financial time series 
Abstract

Multivariate probability density functions of returns are constructed in order to model
the empirical behavior of returns in a financial time series. They describe the
well-established deviations from the Gaussian random walk, such as an approximate scaling
and heavy tails of the return distributions, long-ranged volatility-volatility correlations
(volatility clustering) and return-volatility correlations (leverage effect). Free parameters
of the model are fixed over the long term by fitting 100+ years of daily prices of the Dow
Jones 30 Industrial Average. The multivariate probability density functions which we have
constructed can be used for pricing derivative securities and risk management.
Methods developed for studying complex physical systems have been applied successfully for
decades to analyze financial data [1-3], and they continue to attract growing interest [4-11].
The field of research concerned with modeling financial markets and developing statistically
based real-time decision systems has recently been named Econophysics. The properties of
certain observables of different financial markets appear to be quite similar, indicating the
existence of universal driving mechanisms behind market evolution. Understanding the dynamics
of complex financial systems influences the decisions of market participants and the behavior
of the market, and it represents a scientific challenge. The methods of Econophysics have made
progress connected with the emergence of quantitatively based financial management firms and
the development of high-performance online systems which use methods of statistical inference
from real-time and historical prices as well as other economic information.

The behavior of a financial time series can be described by a generalized Wiener process.
In terms of a drift rate $\mu$, the evolution of a market index value or a stock price $S(t)$ is
governed by the equation [4]:

$$\frac{dS(t)}{S(t)} = \mu\,dt + d\xi(t). \qquad (1)$$

The value $d\xi(t)$ is a noise added to the path followed by $S(t)$, with expectation value
and variance

$$E[d\xi(t)] = 0, \qquad (2)$$

$$\mathrm{Var}[d\xi(t)] = \sigma(t)^2\,dt. \qquad (3)$$

The volatility $\sigma(t)$ represents a generic measure of the magnitude of market fluctuations.
It quantifies risk and enters as an input to option pricing models. We consider a discrete
random walk and set $dt = t_i - t_{i-1}$, $S_i = S(t_i)$, $\xi_i = d\xi(t_i)$, and $\sigma_i = \sigma(t_i)$.
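
The discrete walk can be sketched in a few lines of Python. This is only an illustration: it assumes a unit time step ($dt = 1$), so that $S_i = S_{i-1}(1 + \mu + \xi_i)$, and it draws the noise from a Gaussian, which is the zero-order model rather than the PDFs constructed below. The function name `simulate_walk` and its parameters are hypothetical.

```python
import math
import random

def simulate_walk(s0, mu, sigma, n_steps, seed=0):
    """Discrete walk with unit time step: S_i = S_{i-1} * (1 + mu + xi_i).

    Assumption of this sketch: xi_i ~ N(0, sigma^2), i.e. the Gaussian walk.
    """
    rng = random.Random(seed)
    path = [s0]
    for _ in range(n_steps):
        xi = rng.gauss(0.0, sigma)          # noise with zero mean, variance sigma^2
        path.append(path[-1] * (1.0 + mu + xi))
    return path

# Daily drift and variance of the order quoted later in the text for the DJIA.
path = simulate_walk(s0=100.0, mu=0.000257, sigma=math.sqrt(0.000117), n_steps=250)
print(len(path), path[-1])
```

Setting `sigma=0.0` reduces the path to pure compounding at the drift rate, which is a quick sanity check of the recursion.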

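The random walk model proposed by Bachelier in 1900 [1] is equivalent to the
Gaussian multivariate probability density function (PDF) of the increments $\xi_i$:

$$G_n(\xi_n, \ldots, \xi_1) = \frac{1}{(2\pi)^{n/2}} \exp\Big(-\frac{1}{2}\sum_{i=1}^{n}\frac{\xi_i^2}{\sigma_i^2}\Big) \prod_{i=1}^{n}\frac{1}{\sigma_i}. \qquad (4)$$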
Statistically significant correlations of the increments are absent at time lags longer
than approximately 20 min [5]. The absence of correlations,

$$\mathrm{Corr}[\xi_i, \xi_j] = \delta_{ij}, \qquad (5)$$

has been widely documented (see e.g. Ref. [11]) and is often cited in support of the efficient
market hypothesis [9]. The model (4) constitutes a solid zero-order approximation to the
empirical distributions of the increments $\xi_i$. The multivariate PDFs constructed in this work
are extensions of the Gaussian PDFs, aimed at modeling the well-established deviations of
financial time series from the Gaussian random walk.

The truncated Lévy stable univariate PDFs [2,10,11] are known to provide, for a financial
time series, (i) an approximate scaling invariance of the univariate PDFs with a slow
convergence to the Gaussian behavior and (ii) the existence of heavy tails. We propose the
multivariate Student PDFs,

$$S_n(\xi_n, \ldots, \xi_1) = \frac{\Gamma\big(\frac{\alpha+n}{2}\big)}{(\pi\alpha)^{n/2}\,\Gamma\big(\frac{\alpha}{2}\big)} \Big(1 + \frac{1}{\alpha}\sum_{i=1}^{n}\frac{\xi_i^2}{\omega_i^2}\Big)^{-\frac{\alpha+n}{2}} \prod_{i=1}^{n}\frac{1}{\omega_i}, \qquad (6)$$

for modeling the empirical PDFs of the increments $\xi_i$. It is not difficult to verify that
every marginal PDF of (6) again has the form (6). If we integrate out all of the $\xi_i$ except
for one, we get (6) with $n = 1$. The tails of the distributions behave empirically like
$dW \sim d\xi/\xi^4$ [5], and so $\alpha \approx 3$. In Fig. 1, we compare the $n = 1$, $\alpha = 3$
Student PDF with the Gaussian PDF and with the empirical univariate PDF constructed for the
100+ years of the daily prices of the Dow Jones 30 Industrial Average (DJIA), starting on
May 26, 1896 and ending on December 31, 1999 (i.e. a total of 28507 trading days). The drift
rate was determined to be $\mu = 0.000257$, while $\mathrm{Var}[\xi] = 0.000117$. The PDFs are
shown as functions of $\xi/\sigma$ with $\sigma^2 = \mathrm{Var}[\xi]$. The Student PDF fits the
empirical data very well. The common scale parameter $\omega$ is derived from Eq. (8) using the
empirical value of $\mathrm{Var}[\xi]$.
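
The comparison between the Student and the Gaussian PDFs can be reproduced numerically. The sketch below assumes the standard univariate Student density with $\alpha$ degrees of freedom and scale $\omega$, whose variance is $\omega^2\alpha/(\alpha-2)$, and inverts that relation to fix $\omega$ from the quoted variance. The function names are hypothetical and no empirical data are used.

```python
import math

def student_pdf(xi, omega, alpha):
    """Univariate Student density with alpha degrees of freedom and scale omega."""
    norm = math.gamma((alpha + 1) / 2) / (math.sqrt(math.pi * alpha) * math.gamma(alpha / 2))
    return norm * (1.0 + (xi / omega) ** 2 / alpha) ** (-(alpha + 1) / 2) / omega

def gaussian_pdf(xi, sigma):
    """Gaussian density with zero mean and standard deviation sigma."""
    return math.exp(-0.5 * (xi / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

alpha = 3.0
var = 0.000117                                    # Var[xi] quoted in the text
sigma = math.sqrt(var)
omega = sigma * math.sqrt((alpha - 2.0) / alpha)  # from Var = omega^2 * alpha / (alpha - 2)

# Densities in units of 1/sigma at xi = k * sigma.
for k in (0, 1, 3, 5, 10):
    x = k * sigma
    print(k, student_pdf(x, omega, alpha) * sigma, gaussian_pdf(x, sigma) * sigma)
```

With matched variances the Student density is higher at the origin and far out in the tails, and lower at intermediate values, which is the qualitative pattern described for Fig. 1.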
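For the Student PDF (6), we have

$$E[\xi_i] = 0, \qquad (7)$$

$$\mathrm{Var}[\xi_i] = \omega_i^2\,\frac{\alpha}{\alpha - 2}, \qquad (8)$$

$$\mathrm{Corr}[|\xi_i|^r, |\xi_j|^r] = \frac{\Gamma\big(\frac{r+1}{2}\big)^2 \Big(\Gamma\big(\frac{\alpha}{2}\big)\Gamma\big(\frac{\alpha-2r}{2}\big) - \Gamma\big(\frac{\alpha-r}{2}\big)^2\Big)}{\sqrt{\pi}\,\Gamma\big(\frac{2r+1}{2}\big)\Gamma\big(\frac{\alpha}{2}\big)\Gamma\big(\frac{\alpha-2r}{2}\big) - \Gamma\big(\frac{r+1}{2}\big)^2\Gamma\big(\frac{\alpha-r}{2}\big)^2} \quad (i \neq j). \qquad (9)$$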

The correlation of the increments is the same as in the Gaussian random walk and is given by
Eq. (5). According to Eqs. (2), (3) and (7), (8), the square of the volatility equals

$$\sigma_i^2 = \frac{\alpha\,\omega_i^2}{\alpha - 2}. \qquad (10)$$

The correlation function (9) is positive and does not depend on the lag $l = i - j$. Notice that
$\mathrm{Corr}[|\xi_i|, |\xi_j|] = 2/(2 + \pi) \approx 0.39$ for $\alpha = 3$, while the empirical
value is two times lower (see below). The correlation function diverges at $r > \alpha/2$ due to
the heavy tails.
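
The value $2/(2+\pi)$ can be checked directly from the Gamma-function expression for the correlation of the absolute values. The sketch below evaluates the ratio $\Gamma(\frac{r+1}{2})^2[\Gamma(\frac{\alpha}{2})\Gamma(\frac{\alpha-2r}{2}) - \Gamma(\frac{\alpha-r}{2})^2]$ over $\sqrt{\pi}\,\Gamma(\frac{2r+1}{2})\Gamma(\frac{\alpha}{2})\Gamma(\frac{\alpha-2r}{2}) - \Gamma(\frac{r+1}{2})^2\Gamma(\frac{\alpha-r}{2})^2$, which follows from the moments of a normal variable divided by an independent chi variable. The function name is hypothetical, and the restriction $2r < \alpha$ is imposed so that all moments exist.

```python
import math

def abs_power_corr(r, alpha):
    """Corr[|xi_i|^r, |xi_j|^r], i != j, for the multivariate Student PDF."""
    if not 0 < 2 * r < alpha:
        raise ValueError("the moments exist only for 0 < 2r < alpha")
    g = math.gamma
    common = g(alpha / 2) * g((alpha - 2 * r) / 2)
    num = g((r + 1) / 2) ** 2 * (common - g((alpha - r) / 2) ** 2)
    den = (math.sqrt(math.pi) * g((2 * r + 1) / 2) * common
           - g((r + 1) / 2) ** 2 * g((alpha - r) / 2) ** 2)
    return num / den

print(abs_power_corr(1.0, 3.0), 2.0 / (2.0 + math.pi))
```

The result does not involve the scale parameters or the lag, in line with the statement that the correlation is lag independent.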

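If all components of the vectors $\psi = (\psi_1, \ldots, \psi_n)$ and $\eta = (\eta_1, \ldots, \eta_\alpha)$
are normally distributed $\sim N(0, 1)$, a random vector with components

$$\xi_i = \omega_i \psi_i \Big(\frac{\alpha}{\eta^2}\Big)^{1/2}, \qquad (11)$$

where $\eta^2 = \eta_1^2 + \ldots + \eta_\alpha^2$, has the PDF (6) (see e.g. [12]). The cumulative
effect, $\zeta = \sum_{i=1}^{n} \xi_i$, is described by

$$dW(\zeta) = \frac{\Gamma\big(\frac{\alpha+1}{2}\big)}{\sqrt{\pi\alpha}\,\Gamma\big(\frac{\alpha}{2}\big)} \Big(1 + \frac{\zeta^2}{\alpha\,\Omega^2}\Big)^{-\frac{\alpha+1}{2}} \frac{d\zeta}{\Omega}, \qquad (12)$$

where $\Omega^2 = \sum_{i=1}^{n} \omega_i^2$. The value $\zeta$ can be represented as
$\zeta = (\omega_1\psi_1 + \omega_2\psi_2 + \ldots + \omega_n\psi_n)(\alpha/\eta^2)^{1/2} = \Omega\,\psi\,(\alpha/\eta^2)^{1/2}$,
where $\psi \sim N(0, 1)$, and so Eq. (12) easily follows. The variance of $\zeta$ increases
linearly with $n$, in agreement with the empirical observations. Eq. (12) represents the exact
scaling law for the financial time series, observed empirically by Mandelbrot [3] and discussed
by many authors [5,10,11].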
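
The representation (11) also gives a direct way to sample the increments and to check the linear growth of the variance of the sum. The sketch below is a Monte Carlo illustration with hypothetical function names. It draws $\eta^2$ as a chi-square variable through `random.gammavariate`, and it uses $\alpha = 10$ rather than $\alpha = 3$ in the check because for $\alpha = 3$ the fourth moment is infinite and sample variances converge very slowly.

```python
import math
import random

def sample_increments(omegas, alpha, rng):
    """One draw of (xi_1, ..., xi_n): xi_i = omega_i * psi_i * sqrt(alpha / eta^2)."""
    eta2 = rng.gammavariate(alpha / 2.0, 2.0)   # chi-square with alpha degrees of freedom
    scale = math.sqrt(alpha / eta2)             # common factor shared by all increments
    return [w * rng.gauss(0.0, 1.0) * scale for w in omegas]

def variance_of_sum(n, alpha, omega=1.0, draws=20000, seed=1):
    """Sample variance of zeta = xi_1 + ... + xi_n for equal scales omega."""
    rng = random.Random(seed)
    total = total_sq = 0.0
    for _ in range(draws):
        zeta = sum(sample_increments([omega] * n, alpha, rng))
        total += zeta
        total_sq += zeta * zeta
    mean = total / draws
    return total_sq / draws - mean * mean

# Expected value for equal scales: n * omega^2 * alpha / (alpha - 2).
for n in (1, 2, 4, 8):
    print(n, variance_of_sum(n, alpha=10.0), n * 10.0 / 8.0)
```

The shared factor `scale` is what makes the increments uncorrelated but not independent: their signs are independent, while their magnitudes move together.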

The Student conditional PDF at $n \gg l = n - k$ that gives a forecast density $l$ steps
ahead has the form

where 

k 



,2 _ 

k 



i=i 



The conditional PDF of the values $\xi_n, \ldots, \xi_{k+1}$ is therefore close to the Gaussian PDF.
The ARCH models [16] propose that the increments are distributed as $\sim N(0, v^2)$, with the
volatility $v$ being a function of the lagged increments. The estimator (14) is one of the
possible estimators quantifying the volatility $v$ within the ARCH framework.
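
The dependence of the forecast variance on the lagged increments can be made explicit with the standard conditional-variance formula of the multivariate Student distribution with diagonal scales, $\mathrm{Var}[\xi_j | \xi_1, \ldots, \xi_k] = \omega_j^2\,(\alpha + \sum_{i \le k} \xi_i^2/\omega_i^2)/(\alpha + k - 2)$. The sketch below uses this textbook result; it is not a transcription of the estimator (14), and the function name is hypothetical.

```python
def conditional_variance(past, omegas_past, omega_next, alpha):
    """Variance of the next increment given k past increments.

    Textbook result for the multivariate Student distribution with diagonal
    scales: omega_next^2 * (alpha + sum_i (xi_i / omega_i)^2) / (alpha + k - 2).
    """
    k = len(past)
    q = sum((x / w) ** 2 for x, w in zip(past, omegas_past))
    return omega_next ** 2 * (alpha + q) / (alpha + k - 2.0)

# With no history the unconditional variance omega^2 * alpha / (alpha - 2) is recovered.
print(conditional_variance([], [], 1.0, 3.0))

# With a long history the result approaches the mean of the squared scaled increments.
history = [0.5, -1.5, 1.0, -0.5] * 250
print(conditional_variance(history, [1.0] * len(history), 1.0, 3.0))
```

For $k \gg \alpha$ the forecast variance is close to $\omega_j^2$ times the average of $\xi_i^2/\omega_i^2$ over the history, which is the ARCH-like behavior described above.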

The multivariate Student PDFs therefore have heavy tails and the exact scaling invariance
from the start. The PDFs (6) are apparently a reasonable starting approximation for precise
modeling of the empirical PDFs. These distributions can be modified further to describe
two other well-established stylized facts: (iii) long-ranged volatility-volatility
correlations, also known as volatility clustering [13], and (iv) return-volatility
correlations, also known as the leverage effect [14,15].

The empirical facts show that there is a slow decay of the correlation function. An
extension of the PDF (6) in which the value $\mathrm{Corr}[|\xi_i|^r, |\xi_k|^r]$ decays with time
is rather straightforward. From the representation (11) it is clear that the long-ranged
correlations occur because the denominator $\sqrt{\eta^2/\alpha}$ is common to all of the
increments $\xi_i$. In order to provide a decay of the correlations, it is sufficient to use
different $\eta$'s for different groups of the $\psi_i$'s. The analogy with the Ising model can be
useful: the components of the random vector $\psi$ with the same denominator $\sqrt{\eta^2/\alpha}$
can be treated as domains of spins aligned in the same direction. We assign the usual
probability to every such configuration:

$$w[\sigma_1, \ldots, \sigma_n | \beta] = N \exp\Big(-\beta \sum_{i=1}^{n-1} \sigma_i \sigma_{i+1}\Big), \qquad (15)$$

where $\sigma_i = \pm 1$. The normalization constant is given by $1/N = 2(2\cosh\beta)^{n-1}$. The
correlation of the absolute values of the increments equals (9) provided that $\xi_i$ and $\xi_k$
belong to the same domain. Otherwise the result is zero. The probability of finding $\xi_i$ and
$\xi_k$ within the same domain can be found to be

$$w_l = e^{-\gamma l}, \qquad (16)$$

where $e^{-\gamma} = e^{-\beta}/(e^{\beta} + e^{-\beta}) < 1$ and $l = i - k$. The coefficient
$\mathrm{Corr}[|\xi_i|^r, |\xi_k|^r]$ for the modified PDF has therefore the form of Eq. (9)
multiplied by $w_l$. To fit the empirical data, we use a superposition of (16) with different
values of $\gamma$. The absence of correlations would formally correspond to
$\beta \to +\infty$ ($\gamma = +\infty$). This is the case when the multivariate PDF is a product
of the $n$ univariate PDFs.
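
The exponential form of the same-domain probability follows from the fact that, for an open chain with nearest-neighbor coupling and no external field, the bonds are independent: each pair of neighbors is aligned with probability $e^{-\beta}/(e^{\beta} + e^{-\beta})$. The sketch below, with hypothetical function names, computes this probability, inverts it to obtain $\beta$ from a given $\gamma$, and checks the product rule by a small Monte Carlo run.

```python
import math
import random

def bond_aligned_probability(beta):
    """Probability exp(-gamma) that two neighboring spins are aligned."""
    return math.exp(-beta) / (math.exp(beta) + math.exp(-beta))

def beta_from_gamma(gamma):
    """Invert exp(-gamma) = 1 / (1 + exp(2 * beta)) for gamma > 0."""
    return 0.5 * math.log(math.expm1(gamma))

def same_domain_probability(beta, lag):
    """All `lag` bonds between the two sites are aligned: exp(-gamma * lag)."""
    return bond_aligned_probability(beta) ** lag

def simulate_same_domain(beta, lag, draws=20000, seed=2):
    """Monte Carlo estimate obtained by sampling the bonds independently."""
    rng = random.Random(seed)
    p = bond_aligned_probability(beta)
    hits = sum(all(rng.random() < p for _ in range(lag)) for _ in range(draws))
    return hits / draws

beta = beta_from_gamma(1.0 / 233.0)
print(beta, same_domain_probability(beta, 233), math.exp(-1.0))
```

Long correlation lengths correspond to large negative $\beta$ in this sign convention, while $\beta \to +\infty$ breaks every bond and leaves single-site domains.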

In order to incorporate the leverage effect, we consider the PDFs (6) with $\omega_i$ depending
on the signs $\epsilon_{i-p} = \mathrm{sign}(\xi_{i-p})$ of the lagged increments ($p = 1, 2, \ldots$).
The values $\epsilon_i$ are assumed to be independent variables which take the two values $\pm 1$
with equal probabilities. The function $\omega_i = \chi(\epsilon_{i-1}, \epsilon_{i-2}, \ldots) > 0$ is defined as

c^ = c|l-p^f>^_ p j . (17) 

A requirement of $0 < \nu < 1$ should be imposed to ensure the convergence of the series. For
$0 < \rho < 1$, we have $0 < (1 - \rho)^2 < \omega_i < (1 + \rho)^2$. Negative recent returns
$\epsilon_{i-p} = -1$ ($p = 1, 2, \ldots$) increase the volatility, so $\rho > 0$. The normalization
condition is taken to be $E[\omega_i] = 1$,
so C = (1 + P 2 j^)" 2 - The value of p is connected to the overall strength of leverage effect. 
Taking volatility clustering and leverage effect into account, we obtain: 

Cor I K,\,M = - {1 -' )9 ' W ' + ! {9 '- 1) , (18) 

Corr[|&| = « )Wl + h t (19) 

71 \J{go - £)go 

where g% = E[uiUk], hi = E^e^u^], and / = i — k > 0. We have 

a + f—f m = a + f— r + v_ + ^ (1+ [ ){1 >^ , (20) 

(l + p 2 T —)h l = -2 P —-u l . (21) 

The empirical correlation functions for the daily prices of the DJIA are fitted using a
superposition $w_l = c_1 e^{-\gamma_1 l} + c_2 e^{-\gamma_2 l}$ for $l > 0$ with $c_1 = 0.18$,
$c_2 = 0.08$, $\gamma_1 = 1/1200$, and $\gamma_2 = 1/233$ (else $c_0 = 0.74$ and $\gamma_0 = +\infty$,
so that $\sum_{m=0}^{2} c_m = 1$). The values (15) can be interpreted as conditional
probabilities, while the $c_m$ are probabilities to get the $\gamma_m$. We have also used
$\rho = 1$ and $\nu = \exp(-1/16)$. The results shown in Fig. 2 are in very good agreement with
the data. The value $\mathrm{Corr}[\xi_i, |\xi_k|]$ vanishes at $i > k$ in agreement with the
observations, since the $\omega_i$ depend on lagged increments only.
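
The fitted superposition is easy to tabulate. The sketch below evaluates $w_l = \sum_m c_m e^{-\gamma_m l}$ with the quoted parameters; the $\gamma_0 = +\infty$ component contributes nothing at positive lags. The names `C`, `GAMMA` and `domain_weight` are hypothetical, and the leverage-related factors entering Eqs. (18) and (19) are not included.

```python
import math

C = (0.74, 0.18, 0.08)                   # c_0, c_1, c_2 from the fit
GAMMA = (math.inf, 1 / 1200, 1 / 233)    # gamma_0, gamma_1, gamma_2 per trading day

def domain_weight(lag):
    """w_l = sum_m c_m * exp(-gamma_m * l) for a lag l > 0 in trading days."""
    if lag <= 0:
        raise ValueError("lag must be positive")
    return sum(c * math.exp(-g * lag) for c, g in zip(C, GAMMA))

for lag in (1, 10, 100, 1000):
    print(lag, domain_weight(lag))
```

At short lags only about a quarter of the weight sits in domains longer than one day, and the two exponentials then decay on time scales of roughly 233 and 1200 trading days.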
Finally, the multivariate PDF of the increments is given by

5 n(&» -,£l)f = Xl Cm S (2cOsh(/3 m ))™- 1 ^ II S n f -n f -i (fn,-l, £n,_i) (22) 

where $n_{s+1} = n + 1$, $n \geq n_s > \ldots > n_1 \geq 2$, and $n_0 = 1$. The number of terms
entering Eq. (22), and therefore the calculation time, increases exponentially with $n$. Using a
Pentium IV microprocessor, the straightforward calculation of the $n = 13$ PDF (22) takes
$\sim 10$ sec. The marginal probability of the PDF (22) is a PDF (22) again, but for a shorter
sequence of the increments:
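
The exponential growth can be illustrated by counting the ways of cutting $n$ consecutive increments into contiguous domains. The sketch below assumes that each term is labeled by the start indices of the second and later domains, chosen from $2, \ldots, n$; this gives $2^{n-1}$ configurations. It does not count the additional sum over the components $c_m$, so the actual number of terms in Eq. (22) may be larger. The function names are hypothetical.

```python
from itertools import combinations

def domain_partitions(n):
    """Yield every set of domain start indices (n_1 < ... < n_s) taken from 2..n.

    The first domain always starts at index 1; the empty tuple means that all
    n increments belong to a single domain.
    """
    sites = range(2, n + 1)
    for s in range(n):
        for cuts in combinations(sites, s):
            yield cuts

def count_partitions(n):
    """Number of contiguous-domain configurations for n increments."""
    return sum(1 for _ in domain_partitions(n))

for n in (1, 2, 3, 5, 8, 13):
    print(n, count_partitions(n))
```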



$$\int S_n(\xi_n, \ldots, \xi_1)_F \prod_{i=k+1}^{n} d\xi_i = S_k(\xi_k, \ldots, \xi_1)_F. \qquad (23)$$

The value of $c_0 = 0.74$ turned out to be unexpectedly large. It indicates that approximately
75% of the empirical PDF consists of the product of the univariate Student PDFs.
The second moments of the $\xi_i$ are finite, so the product does converge to the Gaussian PDF
due to the central limit theorem [12]. One should thus verify that the PDF (22) nevertheless
supports the approximate scaling observed for financial time series [3,5,11]. In Fig. 3, we
plot the scaled PDFs of the values $\zeta = \sum_{i=1}^{n} \xi_i$ for $n = 1, 2, 3, 5, 8$, and 13 as
functions of $\zeta/\mathrm{Var}[\zeta]^{1/2}$, together with the scaled Gaussian PDF. We see that
the convergence to the Gaussian PDF is very slow. A qualitative explanation of this phenomenon
is based on the fact that the Lévy stable PDF fits the central part of the empirical
distributions [5,10] and therefore the central part of the Student PDF (see Fig. 1), providing
an exceptionally slow convergence to the Gaussian PDF. In Fig. 4, we compare the scaled
$n = 13$ distributions based upon the PDF (22) with the empirical $n = 13$ PDF. The results are
in excellent agreement, whereas the $n = 1$ Student PDF and the Gaussian PDF clearly disagree
with the data.

The PDF (22) is constructed for the daily increments. There are no a priori reasons to
believe that the daily scale is more appropriate than other time scales. The criterion for
selecting the best time interval is the quality of the fit of the empirical univariate PDF. It
is seen from Fig. 4 that e.g. $\Delta t = 13$ days would be an inaccurate choice. At the same
time, the excellent fit of the daily empirical PDF, shown in Fig. 1, indicates that the
assumption $\Delta t = 1$ day on which the PDF (22) is based is quite reasonable.

If we work with an ensemble of uncorrelated stocks and calculate ensemble averages, we
should use the marginal probabilities. If we work with a single index, we only calculate the
time averages. In such a case, the conditional probabilities should be used. The previous
discussion refers to the calculation of the marginal probabilities. If no correlations exist,
as in the case of the Gaussian random walk, the marginal probabilities coincide with the
conditional probabilities.

We now present some results for the time averages represented by the conditional
probabilities. They are distinct from the marginal ones, since the absolute values of the
increments $\xi_i$ are correlated. In our case, the ergodic hypothesis is valid, since all of the
values $\gamma_m$ and $\ln(1/\nu)$ are distinct from zero, so the correlation lengths are all
finite. This means that the time averages are equivalent to the ensemble averages, provided
that the time scale used for the estimation is much longer than the largest correlation length
($1/\gamma_1 \approx 5$ calendar years). Using smaller time scales, however, the distinction
between these two types of probabilities can be significant.
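
The conversion of the correlation lengths from trading days to calendar years is a short piece of arithmetic. The sketch below derives the average number of trading days per calendar year from the sample itself (28507 trading days between May 26, 1896 and December 31, 1999) and applies it to the fitted lengths $1/\gamma_1 = 1200$, $1/\gamma_2 = 233$ and $1/\ln(1/\nu) = 16$ trading days. The helper name is hypothetical.

```python
from datetime import date

TRADING_DAYS = 28507                                   # sample size quoted in the text
span_days = (date(1999, 12, 31) - date(1896, 5, 26)).days
days_per_year = TRADING_DAYS / (span_days / 365.25)    # average trading days per calendar year

def lag_in_years(trading_days):
    """Convert a lag measured in trading days into calendar years."""
    return trading_days / days_per_year

for name, lag in (("1/gamma_1", 1200), ("1/gamma_2", 233), ("1/ln(1/nu)", 16)):
    print(name, lag, lag_in_years(lag))
```

By this estimate the longest correlation length lies between four and five calendar years, the same order as the value quoted above.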

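The conditional PDF of the increments $\xi_n, \ldots, \xi_{k+1}$ given $\xi_k, \ldots, \xi_1$ equals

$$S_n(\xi_n, \ldots, \xi_{k+1} | \xi_k, \ldots, \xi_1)_F = \frac{S_n(\xi_n, \ldots, \xi_1)_F}{S_k(\xi_k, \ldots, \xi_1)_F}. \qquad (24)$$

It is normalized to unity according to Eq. (23). Note that for $i > k$

$$E[\xi_i | \xi_k, \ldots, \xi_1]_F = 0, \qquad (25)$$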

exp(-p m (i-v- x)) 



Var[&|6,...,6]F = 2-( l - fc ) J2 E C ™E 



k 



2 a 



a — 2 + K\ a 



; „i (2<=o S h(/j m ))- + " 



where x = min(l,t> — 1), and x x = max(0, k + 1 — v). The conditional volatility, according 
to Eqs.(2) and (3), equals 

$$\sigma_i^2 = \mathrm{Var}[\xi_i | \xi_{i-1}, \ldots, \xi_1]_F, \qquad (27)$$

while the volatility (10) can be referred to as the marginal (unconditional) volatility. The
ensemble average of (27) provides (10).
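
The structure of the conditional PDF as a ratio of a joint and a marginal density can be checked numerically in the simplest setting. The sketch below uses the plain multivariate Student density with equal scales, not the full PDF (22), and verifies by a Riemann sum that the ratio is normalized, has zero mean, and has a variance that grows with the size of the observed past increment. The function names and the integration grid are assumptions of this illustration.

```python
import math

def student_joint(xs, omegas, alpha):
    """Multivariate Student density with diagonal scales (standard form)."""
    n = len(xs)
    q = sum((x / w) ** 2 for x, w in zip(xs, omegas))
    norm = math.gamma((alpha + n) / 2) / ((math.pi * alpha) ** (n / 2) * math.gamma(alpha / 2))
    return norm * (1.0 + q / alpha) ** (-(alpha + n) / 2) / math.prod(omegas)

def conditional_density(x_next, past, omega, alpha):
    """Joint density divided by the marginal density of the past increments."""
    k = len(past)
    joint = student_joint(past + [x_next], [omega] * (k + 1), alpha)
    marginal = student_joint(past, [omega] * k, alpha)
    return joint / marginal

def moments(past, omega, alpha, half_width=400.0, steps=80001):
    """Riemann-sum estimates of the norm, mean and second moment."""
    h = 2.0 * half_width * omega / (steps - 1)
    norm = mean = second = 0.0
    for j in range(steps):
        x = -half_width * omega + j * h
        p = conditional_density(x, past, omega, alpha) * h
        norm += p
        mean += x * p
        second += x * x * p
    return norm, mean, second

omega, alpha = 0.006245, 3.0
for last in (0.0, 0.01, 0.02):
    print(last, moments([last], omega, alpha))
```

A large observed increment raises the conditional variance of the next one, while the conditional mean stays at zero.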

The value $\Lambda_i = \ln(S_i/S_{i-1})$ determines the log-return on an asset per interval
$[t_{i-1}, t_i]$. The increments $\xi_i \sim 10^{-2}$ and the drift rate $\mu$ are small, and so
$\Lambda_i \approx \xi_i$ to a first approximation. Ito's lemma (see e.g. [4]) can be applied to
$\ln S(t)$, where $S(t)$ is governed by Eq. (1), provided that $E[(d\xi(t))^n] < \infty$ for all
$n$. In our case, the asymptotic distributions of the increments are such that
$E[(d\xi(t))^{2n}] = \infty$ for $n \geq 2$. Eq. (1) is therefore equivalent to the equation

$$\Lambda_i = \ln(1 + \mu + \xi_i), \qquad (28)$$

which cannot be simplified further by expanding the logarithm in a power series in $\xi_i$ and
discarding the averages of the higher order terms in $\xi_i$.
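
The relation between prices, increments and log-returns can be written out explicitly. The sketch below assumes the discrete walk with unit time step, $S_i = S_{i-1}(1 + \mu + \xi_i)$, and uses hypothetical function names. It shows that the log-return and the increment nearly coincide for typical daily moves and differ noticeably for a large one-day drop.

```python
import math

def log_return(xi, mu):
    """Log-return for a given increment: Lambda = ln(1 + mu + xi)."""
    return math.log(1.0 + mu + xi)

def increment_from_prices(s_prev, s_next, mu):
    """Invert S_next = S_prev * (1 + mu + xi) for the increment xi."""
    return s_next / s_prev - 1.0 - mu

mu = 0.000257
for xi in (0.001, 0.01, -0.05, -0.2):
    print(xi, log_return(xi, mu), log_return(xi, mu) - xi)
```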

The multivariate PDF of log-returns can be written as
$dW[\Lambda_i] = S_n(\xi_i)_F \prod_{i=1}^{n} d\xi_i$, where $\xi_i$ is defined by Eq. (28). The
knowledge of the multivariate PDF provides, in principle, the most complete information on the
future market behavior.

To make a simple check, let us consider the fair value of a stock or a market index at
$t = t_k$:

$$E[S_n] = S_k\, E\Big[\exp\Big(\sum_{i=k+1}^{n} \Lambda_i\Big) \Big| \xi_k, \ldots, \xi_1\Big]_F = (1 + r)^l S_k,$$

where $\mu = r$ is the risk-free discount rate corresponding to the expiration date $t_n$. The
result has the conventional form. The dispersion of $S_n$, however, diverges:
$\mathrm{Var}[S_n | \xi_k, \ldots, \xi_1]_F = \infty$. One can probably assume the existence of a
cutoff in the PDFs at $\sim A$. The largest one-day variation of the DJIA took place on
October 19, 1987, so the value of $A$ is constrained from below by $A > 0.25$.
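
The conventional form of the fair value relies only on the increments having zero conditional mean. As an illustration, the sketch below replaces the Student noise by a toy two-point noise, $\xi_i = \pm a$ with equal probabilities and independent steps, for which the expectation of $\prod_i (1 + \mu + \xi_i)$ can be evaluated exactly by enumeration. The noise model and the function name are assumptions of this example, not part of the model above.

```python
from itertools import product

def expected_price_ratio(mu, a, steps):
    """E[S_n / S_k] for xi_i = +a or -a with equal probability, by full enumeration."""
    total = 0.0
    for signs in product((-1.0, 1.0), repeat=steps):
        ratio = 1.0
        for s in signs:
            ratio *= 1.0 + mu + s * a      # one step of the discrete walk
        total += ratio
    return total / 2 ** steps

mu, a, steps = 0.000257, 0.0108, 10
print(expected_price_ratio(mu, a, steps), (1.0 + mu) ** steps)
```

The two printed numbers agree up to rounding, independently of the noise amplitude `a`.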

To illustrate possible applications of the model, let us consider the pricing of log contracts
[17]. The fair value of a long position, $L = \ln(S_n/S_k)$, is defined as the conditional
expectation value:

$$E[L] = E\Big[\sum_{i=k+1}^{n} \Lambda_i \Big| \xi_k, \ldots, \xi_1\Big]_F.$$

The integrals are determined by the region $|\xi_i| \sim 10^{-2}$. The values $\Lambda_i$ can be
expanded in a power series up to $O(\xi_i^2)$, in which case the integrals are well defined.
Using the conditional PDF (24), we obtain:



The result depends on the lagged increments through the conditional variances
$\mathrm{Var}[\xi_i | \xi_k, \ldots, \xi_1]_F$ given by Eq. (26). The pricing of log contracts is
well defined both with respect to the fair value and the dispersion.
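
A second-order expansion of the kind described above can be sketched as follows. Expanding $\ln(1 + \mu + \xi_i)$ around $\xi_i = 0$ and using the zero conditional mean gives $\ln(1+\mu) - \sigma_i^2/(2(1+\mu)^2)$ per step, where $\sigma_i^2$ stands for the conditional variance. This is an illustrative truncation, not a transcription of the paper's displayed result; the function name is hypothetical and the conditional variances are taken as given inputs.

```python
import math

def log_contract_value(mu, cond_variances):
    """Second-order estimate of E[ln(S_n / S_k)] from per-step conditional variances."""
    return sum(math.log(1.0 + mu) - 0.5 * v / (1.0 + mu) ** 2 for v in cond_variances)

mu, var = 0.000257, 0.000117
approx = log_contract_value(mu, [var])

# Exact one-step value for a toy two-point noise xi = +sigma or -sigma (assumption).
sigma = math.sqrt(var)
exact = 0.5 * (math.log(1.0 + mu + sigma) + math.log(1.0 + mu - sigma))
print(approx, exact, approx - exact)
```

The difference between the two numbers is of fourth order in the noise amplitude, which indicates the accuracy of the truncation for increments of the order of $10^{-2}$.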

It is hard to distinguish empirically between the increments $\xi_i$ and the log-returns
$\Lambda_i$. The small distinction results, in particular, in small return-return correlations
due to the non-vanishing correlations of $\sigma_i^2$ and $\xi_k$. This effect is, however, not
statistically significant for empirical tests. The asymptotic behavior
$dW \sim d\xi/\xi^4 \sim e^{-3\Lambda} d\Lambda$ implies $E[S_n] < \infty$ and
$\mathrm{Var}[S_n] < \infty$, while $dW \sim d\Lambda/\Lambda^4 \sim d\xi/(\xi \ln^4 \xi)$ would
cause severe theoretical problems: $E[S_n] = \mathrm{Var}[S_n] = \infty$. The asymptotic
behavior $\sim d\xi/\xi^4$ refers apparently to the increments $\xi_i$.

The asymptotic behavior of the return distribution is important for correct estimates of
financial risks. The Gaussian walk underestimates the probability of large fluctuations and,
in particular, the probability of crashes. The use of the multivariate Student PDF (22)
gives a more precise idea of the risks involved, in particular those connected with large
market fluctuations.
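
The size of the underestimate can be quantified for a single step. The sketch below compares the two-sided tail probability $P(|\xi| > k\sigma)$ of a univariate Student variable with $\alpha = 3$ and a Gaussian variable of the same variance. It uses the closed-form cumulative distribution of the Student distribution with three degrees of freedom, $F(t) = \frac{1}{2} + \frac{1}{\pi}\big[\arctan(u) + u/(1+u^2)\big]$ with $u = t/\sqrt{3}$; since the variance equals three times the squared scale, $|\xi| = k\sigma$ corresponds to $u = k$. The function names are hypothetical.

```python
import math

def student3_two_sided_tail(k):
    """P(|xi| > k * sigma) for a Student variable with alpha = 3 and variance sigma^2."""
    cdf = 0.5 + (math.atan(k) + k / (1.0 + k * k)) / math.pi
    return 2.0 * (1.0 - cdf)

def gaussian_two_sided_tail(k):
    """P(|xi| > k * sigma) for a Gaussian variable."""
    return math.erfc(k / math.sqrt(2.0))

for k in (2, 3, 5, 10):
    s, g = student3_two_sided_tail(k), gaussian_two_sided_tail(k)
    print(k, s, g, s / g)
```

At five standard deviations the Student tail probability exceeds the Gaussian one by more than three orders of magnitude.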

The following four stylized facts, (i) heavy tails, (ii) scaling of the returns, (iii) volatility
clustering and (iv) leverage effect, are well established empirically but missing in the
Gaussian random walk model. We constructed multivariate PDFs of increments and returns
for financial markets that take these four effects approximately into account. This model
can be useful for more accurate pricing of derivative securities and for risk management.

The authors wish to thank the Dow Jones Global Indexes for providing the DJIA his- 
present publication. M.I.K. is grateful to Metronome-Ricerca sui Mercati Finanziari for

FIGURES




FIG. 1. The $\alpha = 3$ Student and Gaussian univariate PDFs as compared to the histogram
for the 100+ years of daily prices of the Dow Jones 30 Industrial Average. The value of $\xi$ is
the increment (noise) added to the path followed by the index value, $\sigma^2 = \mathrm{Var}[\xi]$.
The common scale of both distributions is fixed by fitting the variance. The solid line
describes the Student PDF, the short-dashed line corresponds to the Gaussian PDF.



FIG. 2. Correlation functions $\mathrm{Corr}[|\xi_i|, |\xi_k|]$ and $\mathrm{Corr}[|\xi_i|, \xi_k]$
versus the time lag $l = i - k$ (in days, logarithmic scale) between two increments $\xi_i$ and
$\xi_k$. The empirical data are constructed using the daily prices of the DJIA. The volatility
clustering (circles) and the leverage effect (boxes) can be clearly seen. The data are
compared to the predictions of the multivariate Student PDF, shown by the solid lines.

FIG. 3. The $\alpha = 3$ scaled Student PDF for $n = 1, 2, 3, 5, 8$, and 13 (solid lines) as
compared to the Gaussian PDF (short-dashed line). The lower values of $n$ correspond to the
higher values of the PDF at the origin $\xi = 0$ and at high absolute values of $\xi$. The
convergence to the Gaussian PDF with increasing $n$ is slow. The value of $\xi$ is the noise
added to the path followed by the index and $\sigma^2$ is the variance of $\xi$.

FIG. 4. The $\alpha = 3$, $n = 13$ and $n = 1$ Student and Gaussian PDFs as compared to the
histogram for the 100+ years of prices of the DJIA with sampling $\Delta t = 13$ days. The value
of $\xi$ is the increment (noise) added to the path followed by the index value,
$\sigma^2 = \mathrm{Var}[\xi]$. The common scale of the distributions is fixed by fitting the
variance of $\xi$. The solid line stands for the $n = 13$ Student PDF, the long-dashed line
stands for the $n = 1$ Student PDF, and the short-dashed line denotes the Gaussian PDF.